Development and validation of a machine learning-based 18F-fluorodeoxyglucose PET/CT radiomics signature for predicting gastric cancer survival

Background The survival prognosis of patients with gastric cancer (GC) often influences physicians' choice of follow-up treatment. This study aimed to develop a positron emission tomography (PET)-based radiomics model combined with clinical tumor-node-metastasis (TNM) staging to predict overall survival (OS) in patients with GC. Methods We reviewed the clinical information of 327 patients with pathologically confirmed GC who underwent 18F-fluorodeoxyglucose (18F-FDG) PET scans. The patients were randomly assigned to training (n = 229) and validation (n = 98) cohorts. We extracted 171 radiomics features from the PET images and determined the PET radiomics scores (RS) using the least absolute shrinkage and selection operator (LASSO) and random survival forest (RSF). A radiomics model, including PET RS and clinical TNM staging, was constructed to predict the OS of patients with GC. This model was evaluated for discrimination, calibration, and clinical usefulness. Results On multivariate Cox regression analysis, age, carcinoembryonic antigen (CEA), clinical TNM stage, and PET RS were statistically significant predictors of OS in GC patients (p < 0.05). A radiomics model was developed based on the results of the Cox regression. The model had a Harrell's concordance index (C-index) of 0.817 in the training cohort and 0.707 in the validation cohort and performed better than both a model with clinical features alone and a model with clinical features combined with clinical TNM staging. Further analyses showed higher PET RS in patients who were older (p < 0.001), had elevated CEA (p < 0.001), or had a higher clinical TNM stage (p < 0.001). Within each clinical TNM stage, a higher PET RS was associated with a worse survival prognosis. Conclusions Radiomics models based on PET RS, clinical TNM staging, and clinical features may provide new tools for predicting OS in patients with GC.


Introduction
GC is the fifth most common cancer in the world, and East Asia continues to have a high incidence of the disease [1]. GC is usually asymptomatic in its early stages; hence, it often remains undiagnosed until it reaches an advanced stage. Comprehensive surgery-based treatment remains the primary approach for advanced GC management [2]. Although the 5-year overall survival rate of GC patients has improved recently [3,4], predictive models and scoring tools for the prognosis of patients with GC are essential to improve individualized treatment. These tools can help clinicians choose follow-up treatment options to improve patient survival. 18F-FDG PET/CT imaging is a vital tool for the characterization, staging, and detection of distant metastases in patients with malignant tumors, including GC [5][6][7]. In particular, this modality has a superior diagnostic ability for detecting distant metastases from cancer compared to computed tomography (CT) [8]. Parameters such as the maximum standardized uptake value (SUVmax), total lesion glycolysis (TLG), and metabolic tumor volume (MTV) are often used to evaluate the prognosis of GC patients [9]. However, the spatial information in diagnostic images has not been fully exploited, and its interpretation still relies on the experience and subjective judgment of physicians.
Radiomics is an innovative technique that extracts high-dimensional information from standard medical images and mines hidden information about disease that may not be visible to the human eye [10][11][12][13]. This holds tremendous potential for the diagnosis, prognostic assessment, and treatment prediction of GC, offering new opportunities in the field of precision medicine [14][15][16][17]. However, radiomics tools based on PET imaging for predicting patient survival are still scarce in GC compared to other types of cancer [18][19][20].
Therefore, in this study, we extracted image features from PET images to establish a model that could be combined with clinical TNM staging for patients with GC, in order to determine whether prediction of the prognosis of patients with GC can be improved.

Patient population
We retrospectively reviewed the medical records and imaging data of patients with GC who underwent 18F-FDG PET/CT at the First Hospital of Wenzhou Medical University between January 2012 and June 2021. The inclusion criteria were age > 18 years, pathologically confirmed GC, and availability of complete follow-up data and clinicopathological characteristics. The exclusion criteria were any previous anticancer treatment, other tumors or serious organic diseases, and incomplete clinical data or missing diagnostic images. The study ultimately enrolled 327 patients, who were randomly assigned in a 7:3 ratio to the training and validation cohorts. A flowchart of patient screening is shown in Fig. 1. Clinical and pathological data of the patients, including sex, age, Nutritional Risk Screening 2002 (NRS 2002), body mass index (BMI), chemotherapy, CEA, carbohydrate antigen 199 (CA199), surgery, and clinical TNM staging, were retrospectively collected from medical records. Clinical TNM staging was determined by a radiologist and a general surgeon according to the 8th edition of the American Joint Committee on Cancer staging system [21]. Each patient was followed up regularly: every 3 months during the first 2 years and every 6 months thereafter through outpatient visits. OS was defined as the time to death from any cause and was used as the endpoint.
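The 7:3 random assignment described above can be illustrated with a minimal sketch. The function name split_cohort, the use of integer patient identifiers, and the fixed seed are assumptions made for illustration only; the study does not report how the randomization was implemented.

```python
import random

def split_cohort(patient_ids, train_fraction=0.7, seed=0):
    """Randomly assign patients to training and validation cohorts.

    The fixed seed is an assumption added for reproducibility of the sketch.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]

# Hypothetical identifiers 0..326 standing in for the 327 enrolled patients.
train_ids, validation_ids = split_cohort(range(327))
print(len(train_ids), len(validation_ids))  # 229 98
```

With 327 patients, rounding 70% gives the cohort sizes reported in the study (229 and 98).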

PET/CT image acquisition
GC patients who underwent 18F-FDG PET/CT were imaged after a 6-hour fast, and blood glucose levels were maintained below 110 mg/dl. Patients were injected intravenously with 18F-FDG (3.7 MBq/kg) and imaged 60 min later using a PET/CT scanner (GEMINI TF 64, Philips, The Netherlands). The parameter settings were as follows: matrix size, 144 × 144; slice thickness, 5 mm; field of view, 576 mm; and emission scan time, 1.5 min for each bed position.

PET/CT image segmentation and feature extraction
LIFEX software was used for volume of interest (VOI) delineation and feature extraction from each patient's PET images [22]. Initially, a radiologist and a general surgeon drew the VOI segmentation using the Digital Imaging and Communications in Medicine protocol; then, an experienced radiologist checked this to ensure the accuracy of subsequent analyses. Two weeks later, the radiologists selected 50 patients and again segmented their VOI for assessment of VOI image quality. The LIFEX software automatically measured the SUVmax of the segmented VOI for the target gastric lesions and selected the VOI using a 40% SUVmax threshold. Additionally, it automatically measured the MTV and TLG of target gastric lesions.
LIFEX software was used to extract 171 radiomics features from the tumor VOI. The radiomics features comprised first-order statistics, shape-based features, gray-level co-occurrence matrix, gray-level run length matrix, gray-level size zone matrix, and neighborhood gray tone difference matrix features.
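The 40% SUVmax thresholding and the derived MTV and TLG can be sketched as follows. This is not the LIFEX implementation: the function name threshold_voi, the inclusive (>=) comparison, and the toy SUV values are assumptions. The voxel volume of 0.08 mL assumes voxels of 4 × 4 × 5 mm, inferred from the reported field of view (576 mm), matrix size (144 × 144), and slice thickness (5 mm). TLG is computed with the standard definition SUVmean × MTV.

```python
def threshold_voi(suv_values, voxel_volume_ml, fraction=0.40):
    """Keep voxels with SUV >= fraction * SUVmax and derive MTV and TLG."""
    suv_max = max(suv_values)
    kept = [v for v in suv_values if v >= fraction * suv_max]
    suv_mean = sum(kept) / len(kept)
    mtv_ml = len(kept) * voxel_volume_ml      # metabolic tumor volume
    tlg = suv_mean * mtv_ml                   # total lesion glycolysis
    return {"suv_max": suv_max, "suv_mean": suv_mean, "mtv_ml": mtv_ml, "tlg": tlg}

# Toy voxel SUVs inside a hand-drawn VOI (hypothetical values).
voxels = [1.0, 2.0, 3.0, 4.0, 5.0, 8.0, 10.0]
result = threshold_voi(voxels, voxel_volume_ml=0.08)
print(result)
```

In this toy example the threshold is 4.0, so four voxels are retained, giving an SUVmean of 6.75 and an MTV of 0.32 mL.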

Construction of PET radiomics scores
To assess the consistency of the features and to screen them, we processed the extracted features as follows. First, intraclass correlation coefficients (ICC) were calculated for the radiomics features extracted from the segmentations of the 50 patients performed by the two radiologists. Highly consistent features were defined as those with ICC values > 0.75. All highly consistent features were standardized using the mean and standard deviation with the z-score algorithm.
Next, we used LASSO with ten-fold cross-validation for feature selection in the training cohort; the selected features were then ranked using the RSF method based on their importance and predictive ability to obtain radiomics scores, called PET RS. Further validation was performed on the data of patients in the validation cohort: based on this model, the PET RS was calculated for each patient in the validation cohort. The median PET RS (60) of the training cohort was used to classify all patients into high-risk and low-risk groups in order to reduce the distributional differences between the training and validation cohorts.
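A minimal sketch of the consistency filter and standardization is given below. The study does not state which ICC form was used; the one-way random-effects ICC(1,1) here is an assumption, as are the function names, the feature names, the toy ratings, and the use of the sample standard deviation in the z-score.

```python
from statistics import mean, stdev

def icc_oneway(ratings):
    """One-way random-effects ICC(1,1); ratings holds one tuple of repeated
    measurements per subject (here, one value per segmentation)."""
    n, k = len(ratings), len(ratings[0])
    grand = mean([x for row in ratings for x in row])
    row_means = [mean(row) for row in ratings]
    msb = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msw = sum((x - m) ** 2 for row, m in zip(ratings, row_means) for x in row) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

def zscore(values):
    """Standardize values to zero mean and unit (sample) standard deviation."""
    mu, sd = mean(values), stdev(values)
    return [(v - mu) / sd for v in values]

# Hypothetical features measured on two segmentations of four patients.
features = {
    "stable_feature": [(1.0, 1.1), (2.0, 2.1), (3.0, 2.9), (4.0, 4.2)],
    "noisy_feature": [(1.0, 3.0), (2.0, 1.0), (3.0, 4.0), (4.0, 2.0)],
}
selected = [name for name, r in features.items() if icc_oneway(r) > 0.75]
standardized = zscore([row[0] for row in features["stable_feature"]])
print(selected)
```

Only features whose repeated measurements agree closely pass the 0.75 cut-off; the surviving features are then standardized before selection.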
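The dichotomization step, in which a cut-off derived from the training cohort is applied unchanged to the validation cohort, can be sketched as below. The scores are hypothetical, and assigning scores equal to the cut-off to the low-risk group is an assumption; the study does not state how ties were handled.

```python
from statistics import median

def dichotomize(scores, cutoff):
    """Label each score as 'high' (above the cut-off) or 'low' (otherwise)."""
    return ["high" if s > cutoff else "low" for s in scores]

# Hypothetical PET RS values; the cut-off comes from the training cohort only.
train_rs = [35.0, 48.0, 60.0, 72.0, 90.0]
validation_rs = [41.0, 66.0, 60.0, 83.0]

cutoff = median(train_rs)
train_groups = dichotomize(train_rs, cutoff)
validation_groups = dichotomize(validation_rs, cutoff)
print(cutoff, validation_groups)
```

Reusing the training median keeps the definition of high and low risk identical in both cohorts, rather than letting each cohort define its own threshold.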

Development and validation of predictive model
To further analyze the clinical features and PET RS of the patients, we performed univariate and multivariate Cox regression analyses to identify prognostically relevant features. Subsequently, we selected features with p < 0.05 and constructed a nomogram that included clinical features, clinical TNM staging, and PET RS to visualize the results. Additionally, we constructed two other models, one with only clinical features and the other with clinical features combined with clinical TNM staging, and used the C-index to compare the discrimination of the three models. A decision curve analysis (DCA) was performed to assess the clinical value of the nomograms [23]. The workflow is illustrated in Fig. 2.
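Harrell's C-index, used above to compare discrimination, is the proportion of comparable patient pairs in which the patient with the higher predicted risk dies first. The following is a minimal sketch with hypothetical data; pairs with tied survival times are ignored here for simplicity, which is an assumption and differs from some software implementations.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    A pair (i, j) is comparable when patient i had the event and did so
    strictly before patient j's observed time.
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    return concordant / comparable

# Hypothetical follow-up (months), death indicators, and predicted risks.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.6, 0.1]
c_index = harrell_c_index(times, events, risks)
print(c_index)  # 0.8
```

In this example five pairs are comparable and four of them are ordered correctly, giving a C-index of 0.8; a value of 0.5 corresponds to random ordering and 1.0 to perfect ordering.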

Statistical analysis
R (version 4.3.3, https://www.r-project.org/) was used for statistical analysis. The t-test was used for normally distributed continuous data, and the Mann-Whitney U test for non-normally distributed continuous data. The chi-square test or Fisher's exact test was used for categorical data. Independent prognostic factors that influenced outcomes were identified using univariate and multivariate Cox analyses. The Kaplan-Meier method was employed to construct survival curves, and the log-rank test was used to compare differences between the two groups. Nomograms, calibration plots, and DCA plots were generated using R packages. Statistical significance was set at p < 0.05.
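The study performed these analyses in R; purely to illustrate what the Kaplan-Meier estimator and the two-group log-rank test compute, a small Python sketch follows. The function names and the toy survival data are hypothetical. The p-value uses the identity that the upper tail of a chi-square distribution with 1 degree of freedom equals erfc(sqrt(x / 2)).

```python
import math

def kaplan_meier(times, events):
    """Return [(event time, survival probability)] for right-censored data."""
    surv, curve = 1.0, []
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        surv *= 1 - deaths / at_risk
        curve.append((t, surv))
    return curve

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value)."""
    times = list(times_a) + list(times_b)
    events = list(events_a) + list(events_b)
    in_a = [True] * len(times_a) + [False] * len(times_b)
    obs_minus_exp, variance = 0.0, 0.0
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        n = sum(1 for x in times if x >= t)
        n_a = sum(1 for x, a in zip(times, in_a) if a and x >= t)
        d = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        d_a = sum(1 for x, e, a in zip(times, events, in_a) if a and x == t and e == 1)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:  # hypergeometric variance of the deaths in group A
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / variance
    return chi2, math.erfc(math.sqrt(chi2 / 2))

# Hypothetical survival times (months) and death indicators for two groups.
high_t, high_e = [2, 3, 4, 5], [1, 1, 1, 1]
low_t, low_e = [6, 8, 9, 10], [1, 0, 1, 0]
km_curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
chi2, p_value = logrank_test(high_t, high_e, low_t, low_e)
print(km_curve, chi2, p_value)
```

The sketch assumes at least one event time at which both groups still have patients at risk; otherwise the variance is zero and the statistic is undefined.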

Patient characteristics
A total of 327 GC patients were randomized to training (n = 229) and validation (n = 98) cohorts in a 7:3 ratio. The training cohort included 171 males and 58 females, and the validation cohort included 72 males and 26 females. Each patient was followed for at least two years, with 215 (65.7%) deaths and a median survival time of 19 months. Table 1 summarizes the detailed characteristics of the patients in the two cohorts.

Radiomics feature selection and PET radiomics scores building
LIFEX software extracted 171 PET radiomics features from the PET images of each GC patient. These features were analyzed in the following steps. First, applying the criterion that the ICC value should be greater than 0.75, 160 features with high consistency were selected for model construction. We then used LASSO analysis to obtain the optimal λ value and three PET radiomics features with nonzero LASSO coefficients (Fig. 2a and b). Next, the selected features were further modeled based on the optimal number of iterations of the RSF (Fig. 2c). Based on the PET RS calculated using this model, each patient was classified into the high- or low-PET RS group. We conducted survival analysis using Kaplan-Meier survival curves and the log-rank test, which showed significant differences between the high-PET RS group and the low-PET RS group in the training (log-rank p < 0.001, Fig. 3a) and validation cohorts (log-rank p < 0.001, Fig. 3b). We further evaluated the accuracy of PET RS for OS prediction in the two cohorts using time-dependent receiver operator characteristics (Fig. 3c, d). Figure 3e-h depicts the relationship between PET RS and survival outcomes, including survival status and time, for each patient.

Model construction and evaluation
To comprehensively evaluate the impact of PET RS and other clinical features on prognosis, we performed a univariate Cox regression analysis in the training cohort. For features with p < 0.05 in univariate analysis, multivariate Cox regression analysis was performed (Table 2), showing that the independent risk factors for OS were age, CEA, clinical TNM stage, and PET RS (Fig. 4a). Additionally, to provide clinicians with a practical tool for risk assessment and therapeutic decision support, we built a nomogram based on the significant variables (age, CEA, clinical TNM stage, and PET RS) in the multivariate Cox regression results (Fig. 4b). The nomogram C-index was 0.817 [95% CI: 0.790-0.844] for OS in the training cohort and 0.707 [95% CI: 0.640-0.774] for OS in the validation cohort. In both cohorts, the 1-year and 2-year nomogram calibration curves showed good agreement between the estimates and actual observations (Fig. 5a, b). Moreover, we developed two other models: a model with clinical features alone and a model with clinical features combined with clinical TNM staging. The nomogram's C-index and integrated Brier score (IBS) were better than those of the other two models (Table 3).
The DCA curves indicated that within a reasonable range of threshold probabilities, the nomogram provided more beneficial prognostic information for patients with GC (Fig. 5c).
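The quantity plotted on a decision curve is the net benefit at each threshold probability, defined as TP/n − (FP/n) × pt/(1 − pt). The sketch below illustrates this for a binary outcome at a fixed time point and ignores censoring, which is a simplifying assumption; the function names and data are hypothetical and do not reproduce the study's survival-based DCA.

```python
def net_benefit(predicted_risk, outcomes, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(outcomes)
    tp = sum(1 for p, y in zip(predicted_risk, outcomes) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(predicted_risk, outcomes) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)

def net_benefit_treat_all(outcomes, threshold):
    """Reference strategy: every patient is treated regardless of risk."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Hypothetical predicted risks of death by a fixed time point and outcomes.
risks = [0.9, 0.8, 0.3, 0.2, 0.6, 0.1]
died = [1, 1, 0, 0, 0, 1]
nb_model = net_benefit(risks, died, threshold=0.5)
nb_all = net_benefit_treat_all(died, threshold=0.5)
print(nb_model, nb_all)
```

A model is considered clinically useful at a given threshold when its net benefit exceeds both the treat-all reference and the treat-none reference, whose net benefit is zero.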

Analysis of PET RS and associated clinical features
Differential analysis of PET RS across clinical features in all patients showed that older age (p < 0.001, Fig. 6a), elevated CEA levels (p < 0.001, Fig. 6b), and more advanced clinical TNM stage (p < 0.001, Fig. 6c) were associated with higher PET RS. We performed an analysis stratified by clinical TNM stage in the two cohorts, and the results showed that PET RS could further differentiate the survival of patients within different stages of GC (Fig. 6d-g).

Discussion
Radiomics combines noninvasive imaging and artificial intelligence technologies for applications in the diagnosis, prognosis, and individualized treatment of diseases. Jiang et al. used a CT-based radiomics model to predict disease-free survival (DFS) and OS in GC patients, and their results showed that the radiomic signature was a predictor of DFS and OS [24]. Wang et al. constructed a CT-based radiomic nomogram to predict lymph node metastasis in GC [25]. Xu et al. applied a machine-learning model with CT to predict pathological downstaging after neoadjuvant chemotherapy in patients with advanced GC, which contributed to subsequent surgical treatment [26]. However, compared to CT, PET has unique advantages in the differential diagnosis, precise staging, and distant metastasis diagnosis of GC [27], which are conducive to increasing the survival rate. Previous studies have demonstrated the potential of PET radiomics in the prediction of lymph node and peritoneal metastases [28,29]. It has also been shown that PET radiomics is a predictor of OS and DFS, as well as of the benefit of chemotherapy in patients [30]. Our study also demonstrated that PET RS predicted OS in patients with GC and that GC patients with high PET RS had poorer survival prognoses.
Findlay et al. showed that routine staging of GC using PET/CT could detect metastases and predict early postoperative recurrence [6]. We included clinical TNM staging to take advantage of PET imaging, which can provide patients with more accurate clinical staging. The results of this study indicate that clinical TNM staging is a reliable predictor of patient prognosis, which also provides a basis for treatment strategies for patients with advanced GC who cannot undergo invasive surgery. We constructed a nomogram that incorporated clinical TNM staging, clinical features, and PET RS and compared it with a model with clinical features alone and a model with clinical features combined with clinical TNM staging. The results suggest that PET RS can provide additional information to a certain extent, which can be obtained from the quantitative assessment of heterogeneity using the radiomics features of PET [31]. This may be related to the effect of tumor heterogeneity on survival and prognosis [32,33]. In other words, PET RS can complement the predictive value of clinical TNM staging, contributing to personalized treatment and patient follow-up. The underlying mechanisms may be accessible at the genomic or histological levels [34].
There are several limitations to this study. This was a single-center study with a small sample size and no external validation. In subsequent studies, we intend to increase the sample size and conduct multicenter studies to ensure that the model can be applied to a larger population. Additionally, most patients selected for PET had advanced GC, which may have introduced selection bias. Furthermore, we only studied PET images; in the future, we plan to construct a combined CT and PET model to make full use of the imaging data.

Conclusion
In conclusion, we constructed a PET-based radiomics model and combined it with clinical TNM staging and clinical features to predict OS in patients with GC. PET RS can be effective in predicting patients' clinical outcomes, adding predictive value to clinical TNM staging.

Fig. 2 The workflow of the study. (a) Partial likelihood deviance values with respect to different λ values in the LASSO model. (b) Selection of the optimal λ value. (c) Ranking of the importance of each feature in the RSF model. PET: Positron emission tomography; LASSO: Least absolute shrinkage and selection operator; RSF: Random survival forest

Fig. 3 PET RS as a prognostic indicator. (a) Kaplan-Meier curves of OS between high- and low-PET RS groups in the training cohort. (b) Kaplan-Meier curves of OS between high- and low-PET RS groups in the validation cohort. (c) ROC curve of the PET RS in the training cohort. (d) ROC curve of the PET RS in the validation cohort. (e, g) The survival distribution of GC patients with different PET RS in the training cohort. (f, h) The survival distribution of GC patients with different PET RS in the validation cohort. PET: Positron emission tomography; RS: Radiomics score; OS: Overall survival; ROC: Receiver operator characteristics; GC: Gastric cancer

Fig. 4 PET RS and different clinical features were used to predict OS. (a) Multivariate Cox analysis of PET RS and different clinical features in the training cohort. (b) Nomogram constructed based on PET RS and different clinical characteristics in the training cohort. PET: Positron emission tomography; RS: Radiomics score; OS: Overall survival; CEA: Carcinoembryonic antigen; CA199: Carbohydrate antigen 199; TNM: Tumor-node-metastasis. *, P < 0.05

Fig. 5 Fig. 6
Fig. 5 The nomogram calibration curves. These show consistent calibration of 1-year and 2-year OS predictions in the training cohort (a) and validation cohort (b). Decision curves were analyzed for each model in all patients with GC to show the survival benefit (c). OS: Overall survival; GC: Gastric cancer; TNM: Tumor-node-metastasis

Table 1
Clinical features of patients according to the PET RS in the training and validation cohorts. Note: Unless otherwise stated, data are numbers of patients, with percentages in parentheses. *, P < 0.05; †, data are medians with interquartile ranges in parentheses; BMI: Body mass index; NRS 2002: Nutritional Risk Screening 2002; CEA: Carcinoembryonic antigen; CA199: Carbohydrate antigen 199; TNM: Tumor-node-metastasis; SUVmean: Mean standardized uptake value; SUVmax: Maximum standardized uptake value; MTV: Metabolic tumor volume; TLG: Total lesion glycolysis

Table 3
OS prediction performance of the three models in the training and validation cohorts. C-index: Harrell's concordance index; CI: Confidence interval; IBS: Integrated Brier score